i.e. inversions in Hilbert space / infinite-dimensional Bayesian inference
One thing we'd (I'd) like to do better in inverse problems in the Earth sciences is encoding the prior. For Gaussian knowledge, one way to do this is using Gaussian random fields (GRFs). Gaussians are arguably among the most popular priors, even if they are not always the best choice.
GRFs have a few advantages over discrete Gaussian priors which make them particularly interesting for inverse problems:
A typical discrete prior distribution has a marginal distribution for every parameter $m_i$, and a total prior distribution for the complete set of parameters $\mathbf{m}$. In contrast, a Gaussian random field (or process) is a continuous measure of knowledge on a function space. That is, it describes a function $u(\mathbf{x})$ living on some domain $\Omega$. Effectively, we now invert in a Hilbert space on the domain $\Omega$.
The GRF is characterised by $\mathcal{N}(\mu_0(\mathbf{x}), \mathcal{C})$. Here, $\mu_0(\mathbf{x})$ is the mean function, and $\mathcal{C}$ the covariance operator. The operator $\mathcal{C}$ is a real beast, and it is the thing to understand about GRFs. It defines the dispersive properties of the random process, i.e. the direction and strength of correlations.
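For intuition, here is a minimal finite-dimensional analogue in NumPy: sampling from $\mathcal{N}(\mu_0, C)$ on a 1-D grid, with a hypothetical squared-exponential covariance standing in for $\mathcal{C}$ (this is a sketch for intuition, not the operator used later in this notebook):

```python
import numpy as np

# Finite-dimensional analogue: sample from N(mu0, C) on a 1-D grid, using a
# squared-exponential covariance (a hypothetical choice, for illustration only).
x = np.linspace(0.0, 1.0, 50)
mu0 = np.zeros_like(x)
length_scale = 0.2
C = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * length_scale**2))
# A small jitter keeps the Cholesky factorisation numerically positive definite
L = np.linalg.cholesky(C + 1e-8 * np.eye(x.size))
sample = mu0 + L @ np.random.randn(x.size)
```

The infinite-dimensional story below replaces the explicit covariance matrix $C$ with an operator, and the Cholesky factor with PDE solves.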
Note: I'm not a mathematician, nor well-versed in FEM literature. Therefore, there are bound to be errors in this notebook. Please consult the references below for rigorous mathematical details. However, I do find that in this notebook the actual content is a little less convoluted for the average Joe, MSc.
Number | Author | Title | Year | Journal/Proceedings | Reftype | DOI/URL |
---|---|---|---|---|---|---|
1 | Bui-Thanh, T., Ghattas, O., Martin, J. and Stadler, G. | A computational framework for infinite-dimensional Bayesian inverse problems, Part I: The linearized case, with application to global seismic inversion | 2013 | SIAM Journal on Scientific Computing, Vol. 35(6), pp. A2494-A2523 | article | link |
2 | Bui-Thanh, T. and Nguyen, Q.P. | FEM-based discretization-invariant MCMC methods for PDE-constrained Bayesian inverse problems | 2016 | Inverse Problems and Imaging, Vol. 10(4), pp. 943-975 | article | link |
3 | Stuart, A.M. | Inverse problems: a Bayesian perspective | 2010 | Acta Numerica, Vol. 19, pp. 451-559 | article | |
4 | Le Dret, H. and Lucquin, B. | Partial differential equations: modeling, analysis and numerical approximation | 2016 | Vol. 168 | book | |
5 | Abrahamsen, P. | Gaussian Random Fields and Correlation Functions | 1997 | | article | |
6 | Beskos, A., Pinski, F.J., Sanz-Serna, J.M. and Stuart, A.M. | Hybrid Monte Carlo on Hilbert spaces | 2011 | Stochastic Processes and their Applications, Vol. 121(10), pp. 2201-2230 | article | |
7 | Hinze, M., Pinnau, R., Ulbrich, M. and Ulbrich, S. | Optimization with PDE constraints | 2008 | Vol. 23 | book | |
8 | Paciorek, C.J. and Schervish, M.J. | Nonstationary covariance functions for Gaussian process regression | 2004 | Advances in Neural Information Processing Systems, pp. 273-280 | inproceedings | |
Now before we dive into $\mathcal{C}$ I'd like to introduce the characteristic function of a GRF. It is defined as: $$ \begin{align} \chi(u) :=& \frac{1}{2} \left|\left| \left( u - \mu_0\right) \right|\right|^2_{\mathcal{C}}\\ =& \frac{1}{2} \left|\left| \mathcal{C}^{-\frac{1}{2}}\left( u - \mu_0\right) \right|\right|^2_{L^2\left( \Omega \right)}\\ =& \frac{1}{2}\langle \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right) , \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right) \rangle\\ =& \frac{1}{2}\int_\Omega \left( \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right)\right)^T \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right) d\mathbf{x}\\ =& \frac{1}{2} \int_\Omega \left( u - \mu_0 \right)^T (\mathcal{C}^{-\frac{1}{2}})^T \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right) d\mathbf{x}\\ =& \frac{1}{2}\int_\Omega \left( u - \mu_0\right)^T \mathcal{C}^{-\frac{1}{2}} \mathcal{C}^{-\frac{1}{2}} \left( u - \mu_0\right) d\mathbf{x}\\ =& \frac{1}{2}\int_\Omega \left( u - \mu_0\right)^T \mathcal{C}^{-1} \left( u - \mu_0\right) d\mathbf{x}\\ \end{align} $$ where the transpose is dropped in the second-to-last step because $\mathcal{C}^{-\frac{1}{2}}$ is self-adjoint. This quantity is also called the (squared) Cameron-Martin norm of the function $\left[u - \mu_0\right]$ associated with $\mathcal{C}$. Note that the mean function only enters through the shift $u - \mu_0$; the norm itself is determined entirely by $\mathcal{C}$.
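The last identity can be sanity-checked in finite dimensions, with a symmetric positive-definite matrix standing in for $\mathcal{C}$ (a sketch, not the operator used later in this notebook):

```python
import numpy as np

# Finite-dimensional check of the identity above: for an SPD covariance matrix C,
# 0.5 * || C^{-1/2} (u - mu0) ||^2  ==  0.5 * (u - mu0)^T C^{-1} (u - mu0).
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
C = B @ B.T + 6 * np.eye(6)                  # SPD stand-in covariance
u, mu0 = rng.standard_normal(6), rng.standard_normal(6)

# Build C^{-1/2} from the eigendecomposition (self-adjoint, so this is exact)
vals, vecs = np.linalg.eigh(C)
C_inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

w = C_inv_sqrt @ (u - mu0)
lhs = 0.5 * w @ w
rhs = 0.5 * (u - mu0) @ np.linalg.solve(C, u - mu0)
# lhs and rhs agree to machine precision
```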
Choosing an appropriate $\mathcal{C}$ is serious business. It has to satisfy a few properties to make the mathematics work out. For starters, because $\mathcal{C}$ defines a Gaussian measure, it needs to be positive definite. It should also be self-adjoint. Additionally, its covariance functions (we'll get to these) need to be bounded. There are more requirements, described in [1], [2], and most completely in [3]. We'll analyse a few choices below.
For an in-depth analysis of general second-order elliptic problems and their variational formulations I refer to page 96 of [4].
One nice choice of $\mathcal{C}$ is $\mathcal{C} = \mathcal{A}^{-2}$, where $\mathcal{A}$ is a Laplacian-like operator. The Laplacian-like operator used in [1] and this notebook is: $$ \begin{align} \mathcal{A} = -\alpha \nabla \cdot \left( \Theta \nabla \right) + \alpha \end{align} $$ with the Neumann boundary condition on the field $u$: $$ \begin{align} \alpha \left( \Theta \nabla u \right) \cdot \mathbf{n} = 0, \quad \text{on } \partial \Omega. \end{align} $$ The nice property is that we can allow $\Theta$ to be a non-stationary operator that encodes spatially varying dispersion and anisotropy. A great example $\Theta(\mathbf{x})$ for a velocity prior is given in [1], which has been replicated in this notebook (further down).
With the chosen $\mathcal{A}$ the covariance operator becomes: $$ \begin{align} \mathcal{C} =& (-\alpha \nabla \cdot \left( \Theta \nabla \right) + \alpha)^{-2}. \end{align} $$ The resulting measure is well behaved for at most 3 spatial dimensions, as demonstrated in [2].
Another operator, not implemented in this notebook, is the generalization of the operator $\mathcal{A}$ above in the case that $\Theta = \mathbf{I}$, as proposed in [2]: $$ \begin{align} \mathcal{C} =& \alpha^{-1} \left(I - \Delta\right)^{-s}\\ =& \alpha^{-1} \mathcal{A}^{-s} \end{align} $$ with the Neumann boundary condition on the field $u$: $$ \begin{align} \frac{\partial u}{\partial n} = 0, \quad \text{on } \partial \Omega. \end{align} $$ These operators have the nice property that the measure is well behaved as long as $s > d/2$, with $d$ the spatial dimension. However, this operator has no non-stationary component.
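The $s > d/2$ condition can be illustrated spectrally. Assuming $\Omega = (0, 1)$ with Neumann conditions (so the eigenvalues of $-\Delta$ are $(k\pi)^2$), the eigenvalues of $\mathcal{C} = \alpha^{-1}(I - \Delta)^{-s}$ are $\alpha^{-1}(1 + (k\pi)^2)^{-s}$, and the measure is well defined when their sum converges:

```python
import numpy as np

# Eigenvalues of C = alpha^{-1} (I - Laplacian)^{-s} on (0, 1) with Neumann BCs.
# Trace-class (well-behaved measure) requires the eigenvalue sum to converge,
# which in d = 1 means s > 1/2.
alpha = 1.0
k = np.arange(1, 200001)

def spectrum(s):
    return (1.0 / alpha) * (1.0 + (k * np.pi) ** 2) ** (-s)

trace_ok = spectrum(1.0).sum()    # s = 1   > 1/2: partial sums level off
trace_bad = spectrum(0.25).sum()  # s = 1/4 < 1/2: partial sums keep growing
```

With $s = 1$ the partial sums are bounded by $\sum_k (k\pi)^{-2} = 1/6$, while with $s = 1/4$ they grow without bound as more modes are included.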
The PDE associated with the covariance operator $\mathcal{C}$ from [1] is given as: $$ \mathcal{A}u = f $$ where $f$ is an arbitrary forcing term. The boundary condition is still the same: $$ \begin{align} \alpha \left( \Theta \nabla u \right) \cdot \mathbf{n} = 0, \quad \text{on } \partial \Omega. \end{align} $$
Solving this PDE for different forcing terms is very insightful. By applying a Dirac delta function as the forcing we can, for example, compute the covariance functions. By applying Gaussian white noise as the forcing, we can generate samples from the distribution. Both are demonstrated in this notebook.
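As a sketch of both experiments, here is a 1-D finite-difference analogue (assuming $\alpha = 1$ and $\Theta = I$ for simplicity; this is not the notebook's FEM setup):

```python
import numpy as np

# 1-D finite-difference sketch (alpha = 1, Theta = I are simplifying assumptions):
# assemble A = -d^2/dx^2 + I on (0, 1) with homogeneous Neumann conditions, then
# force once with a discrete Dirac delta and once with white noise.
n = 200
h = 1.0 / (n - 1)
A = (np.diag(np.full(n, 2.0 / h**2 + 1.0))
     + np.diag(np.full(n - 1, -1.0 / h**2), 1)
     + np.diag(np.full(n - 1, -1.0 / h**2), -1))
A[0, 1] = A[-1, -2] = -2.0 / h**2        # Neumann: mirror the ghost node

delta = np.zeros(n)
delta[n // 2] = 1.0 / h                  # discrete Dirac delta at x = 0.5
green = np.linalg.solve(A, delta)        # Green's-function-like response of A

noise = np.random.randn(n) / np.sqrt(h)  # discrete white noise
sample = np.linalg.solve(A, noise)       # a random-field sample
```

The delta forcing produces a bump centred on the injection point, while the white-noise forcing produces a smooth random realisation, exactly the two experiments carried out with FEM below.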
To be written
Decaying eigenvalues, minimal variance length, spectral decomposition, FEM discretization, infinite eigenfunctions.
We'll be solving PDEs using finite elements on some arbitrary domain in at most $\mathbb{R}^3$. We use the FEM package Fenics and its associated meshing package mshr. Additionally, we'll import SciPy for sparse matrix algebra, NumPy for random number generation and linear algebra, and Matplotlib for plotting.
# FEM tools
import fenics
import mshr
# Linear algebra tools
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix
from scipy import linalg
from scipy.sparse.linalg import spsolve
# Plotting tools
import matplotlib.pyplot as plt
import matplotlib.tri as tri
from IPython import display
# A function to extract mesh coordinates into triangulation objects
def mesh2triang(mesh):
xy = mesh.coordinates()
return tri.Triangulation(xy[:, 0], xy[:, 1], mesh.cells())
To start, we'll use mshr to create a circular domain and mesh it rather fine.
# Create mesh and plot
domain = mshr.Circle(fenics.Point(0, 0), 1)
mesh = mshr.generate_mesh(domain, 15)
# Let's refine a half of the domain
cell_markers = mshr.dolfin.MeshFunction("bool", mesh, 2) # Create a boolean function on the mesh elements
cell_markers.set_all(False) # Set all the values to false
for cell in fenics.cells(mesh): # Loop over the cells
coor_midpoint = cell.midpoint() # Find the midpoint
if coor_midpoint[0] > 0: # if the x-midpoint is larger than 0 ...
cell_markers[cell] = True # ... set true
mesh = fenics.refine(mesh, cell_markers) # refine where function is true
# Let's refine another quarter of the domain
cell_markers = mshr.dolfin.MeshFunction("bool", mesh, 2)
cell_markers.set_all(False)
for cell in fenics.cells(mesh):
coor_midpoint = cell.midpoint()
    if coor_midpoint[0] > 0 and coor_midpoint[1] > coor_midpoint[0]:  # if the x-midpoint > 0 AND the y-midpoint > the x-midpoint ...
cell_markers[cell] = True # ... set true
mesh = fenics.refine(mesh, cell_markers)
fig = plt.figure(figsize=(10, 10), dpi=80, facecolor="w", edgecolor="k")
lines = fenics.plot(mesh)
plt.xlabel("x")
plt.ylabel("y")
To solve the elliptic PDE associated with the covariance operator we need to create its variational form. For the operator from [1] the variational form is given as: $$ \int_\Omega \alpha \left( \Theta \nabla u \right) \cdot \nabla v + \alpha \, u \, v \, dx = \int_\Omega f \, v \, dx $$ with trial function $u$ and test function $v$. Identifying the linear and bilinear forms as respectively the RHS and LHS makes this expression easily solvable using Fenics.
We'll start by defining the function space and test/trial functions to live in this space. This directly defines the shape functions used in the FEM solve.
V = fenics.FunctionSpace(mesh, "CG", 1)
u = fenics.TrialFunction(V)
v = fenics.TestFunction(V)
Next, we enter expressions for the linear and bilinear forms. $\Theta$ is taken from [1], but rescaled for the unit circle and reduced to two dimensions.
# Create the operator theta as a space-dependent tensor operation
theta = fenics.Expression(
(
(
"Beta*(1 - x[0] * x[0] * (1-Theta) * ( 2 * pow((x[0] * x[0] + x[1] * x[1]),0.5) - (x[0] * x[0] + x[1] * x[1])) / (x[0] * x[0] + x[1] * x[1]))",
"Beta*( - x[0] * x[1] * (1-Theta) * ( 2 * pow((x[0] * x[0] + x[1] * x[1]),0.5) - (x[0] * x[0] + x[1] * x[1])) / (x[0] * x[0] + x[1] * x[1]))",
),
(
"Beta*( - x[1] * x[0] * (1-Theta) * ( 2 * pow((x[0] * x[0] + x[1] * x[1]),0.5) - (x[0] * x[0] + x[1] * x[1])) / (x[0] * x[0] + x[1] * x[1]))",
"Beta*(1 - x[1] * x[1] * (1-Theta) * ( 2 * pow((x[0] * x[0] + x[1] * x[1]),0.5) - (x[0] * x[0] + x[1] * x[1])) / (x[0] * x[0] + x[1] * x[1]))",
),
),
degree=1,
Theta=fenics.Constant(0.0004),
Beta=fenics.Constant(10 ** -0),
)
alpha = fenics.Constant(1.0)
# fenics.dx creates a domain integral, fenics.ds creates a surface integral
bilinear_component = alpha * (fenics.dot(theta * fenics.grad(u), fenics.grad(v)) + u * v) * fenics.dx
linear_component = fenics.Constant(0) * v * fenics.dx
Now that the operators are constructed, we could solve the PDE. However, to compute the Cameron-Martin norm and to generate samples we need the actual matrices, i.e. the discretised versions of our operators on the mesh. Luckily, Fenics allows us to extract these objects.
The stiffness matrix can be constructed directly from the operator's bilinear and linear forms. Note that this also generates the forcing vector, even though we have not added any sources to our PDE yet!
%%time
stiffness_matrix, forcing_vector = fenics.assemble_system(bilinear_component, linear_component)
CPU times: user 11.3 ms, sys: 61 µs, total: 11.4 ms Wall time: 11.3 ms
The mass matrix is constructed by assembling the product of the trial and test functions.
mass_matrix = fenics.assemble(u * v * fenics.dx)
We now get the actual data behind these matrices (petsc4py matrices).
K = fenics.as_backend_type(stiffness_matrix).mat()
M = fenics.as_backend_type(mass_matrix).mat()
f = fenics.as_backend_type(forcing_vector).vec()
Convert to something we can actually use, i.e. SciPy sparse matrices. See this SciPy reference on constructing CSR matrices. We could also use PETSc subroutines from petsc4py, but I'm not too familiar with the framework and no API reference manual exists at the time of writing. This document is the best I found on it: auto-generated petsc4py reference.
Ka, Kb, Kc = K.getValuesCSR()
stiffness_matrix_sparse = csr_matrix((Kc, Kb, Ka), K.size)
Ka, Kb, Kc, K = (None, None, None, None) # Freeing up space
Ma, Mb, Mc = M.getValuesCSR()
mass_matrix_sparse = csr_matrix((Mc, Mb, Ma), M.size)
Ma, Mb, Mc, M = (None, None, None, None) # Freeing up space
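For reference, here is a toy example of the `(data, indices, indptr)` CSR triplet used above: row `i` holds `data[indptr[i]:indptr[i+1]]` in columns `indices[indptr[i]:indptr[i+1]]` (this is generic SciPy behaviour, independent of the FEM matrices):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy illustration of the (data, indices, indptr) CSR triplet:
# row 0 -> values [4, 1] in columns [0, 2]
# row 1 -> value  [2]    in column  [1]
# row 2 -> value  [3]    in column  [2]
data = np.array([4.0, 1.0, 2.0, 3.0])
indices = np.array([0, 2, 1, 2])
indptr = np.array([0, 2, 3, 4])
A = csr_matrix((data, indices, indptr), shape=(3, 3))
# A.toarray() == [[4, 0, 1], [0, 2, 0], [0, 0, 3]]
```

PETSc's `getValuesCSR()` returns exactly these three arrays (in the order `indptr, indices, data`), which is why the code above passes `(Kc, Kb, Ka)` to `csr_matrix`.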
Inspecting the discretised operator matrices to show us sparsity!
fig = plt.figure(figsize=(20, 10), dpi=80, facecolor="w", edgecolor="k")
if mass_matrix_sparse.shape[0] < 500:
# Don't do this for large systems
# Stiffness matrix
plt.subplot(121)
stiffness_matrix_dense = stiffness_matrix_sparse.todense()
extremum = np.max(np.abs(stiffness_matrix_dense))
plt.imshow(stiffness_matrix_dense, vmin=-extremum, vmax=extremum)
plt.title("Stiffness matrix", size=30)
plt.colorbar()
# Mass matrix
plt.subplot(122)
mass_matrix_dense = mass_matrix_sparse.todense()
extremum = np.max(np.abs(mass_matrix_dense))
plt.imshow(mass_matrix_dense, vmin=0, vmax=extremum)
plt.title("Mass matrix", size=30)
plt.colorbar()
else:
plt.subplot(121)
plt.spy(stiffness_matrix_sparse, markersize=1)
plt.title("Stiffness matrix (sparse)", size=30)
plt.subplot(122)
plt.spy(mass_matrix_sparse, markersize=1)
plt.title("Mass matrix (sparse)", size=30)
Remember that we have not applied a source term yet to the PDE, which should be reflected in the forcing vector:
plt.plot(forcing_vector.get_local())
plt.xlabel("Forcing vector component")
plt.ylabel("Forcing vector component value")
We can now start to solve our PDE. However, we have two ways of doing this. First, we have all our Fenics machinery, which provides nice wrappers around the linear systems with handy functions if we were actually interested in the diffusion problem. Secondly, we just extracted all these nice matrices, and we could start to solve systems like $Ku = f$ directly using linear algebra.
We'll see that both are useful. One nice feature of Fenics is that it implements various source terms, which it automatically discretizes appropriately for the given problem. This is especially nice when working with sources like Dirac deltas. Therefore, we'll use Fenics' machinery to solve for the Green's functions in the following section.
However, for generating samples the source term will need to be a special kind of spatial white noise (we'll discuss it later). Fenics does not provide this source, and I was not able to dive into the specifics of Fenics' forcing vector. To still solve the system, we'll do a manual solve, using either NumPy or SciPy. Additionally, the matrices we extracted are essential in calculating the Cameron-Martin norm of a function.
The covariance functions are not directly defined, only through the covariance operator. But investigating the covariance functions for a few points is very enlightening.
The definition of the covariance function is the following: $$ \begin{align} (\mathcal{C} u) (\mathbf{x}) = \int_\Omega c(\mathbf{x}, \mathbf{x'} )\: u( \mathbf{x'} ) d \mathbf{x'} \end{align} $$ That's a Green's function!
Yup, and that makes investigating these rather 'easy'. We simply inject a Dirac source into our elliptic PDE and let the FEM package compute the solution. What we then see is the covariance function w.r.t. the injection point.
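In the discrete setting this identity is easy to verify: the covariance function is the kernel of the operator, so applying the discretised operator's inverse to a discrete Dirac delta extracts one column of it (a sketch with a stand-in SPD matrix, not the actual FEM system):

```python
import numpy as np

# Discrete version of the Green's function identity: applying the inverse of a
# stand-in elliptic operator A to a discrete delta e_j extracts column j of A^{-1}.
rng = np.random.default_rng(1)
B = rng.standard_normal((8, 8))
A = B @ B.T + 8 * np.eye(8)     # stand-in SPD elliptic operator
C = np.linalg.inv(A)            # its "covariance-like" inverse

j = 3
e = np.zeros(8)
e[j] = 1.0                      # discrete Dirac delta at node j
g = np.linalg.solve(A, e)       # one solve with a delta right-hand side
# g equals column j of C
```

This is exactly what the FEM solve with a `PointSource` does below, node by node.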
Let's give it a try for a few different points.
fig = plt.figure(figsize=(10, 10), dpi=80, facecolor="w", edgecolor="k")
# A few different Greens function source locations
y = [0, 0.25, 0.75, 0.9]
for i in range(len(y)):
# Create the solution function
solution = fenics.Function(V)
# Create a new RHS vector for every solve.
forcing_vector_copy = forcing_vector.copy()
# Create the source (a Dirac Delta, for the Green's function)
DiracSource = fenics.PointSource(V, fenics.Point(0.0, y[i]), 1.0)
# Add the dirac function to the forcing vector. Note that if this is applied consecutively on
# the same object, two sources are added! That's why we make a copy of the forcing_vector.
DiracSource.apply(forcing_vector_copy)
# Solve the PDE
fenics.solve(stiffness_matrix, solution.vector(), forcing_vector_copy)
# Compute values on mesh
mesh_values = solution.compute_vertex_values(mesh)
# Plot the Green's functions
    plt.subplot(len(y) // 2, 2, i + 1)
plt.gca().set_aspect('equal')
image = plt.tripcolor(mesh2triang(mesh), mesh_values, shading='gouraud', cmap=plt.get_cmap("pink"))
image.set_clim(vmin=0)
plt.colorbar()
plt.axis('off')
plt.tight_layout()
plt.show()
As you might be able to see, although the covariance functions look generally symmetric, the upper right quadrant actually has the most resolution, due to the finer grid. This is okay as long as the mesh at least resolves the dominant eigenvalues / variations.
If we now inspect the forcing vector of the PDE corresponding to the last Green's function we'll see how the pulse is implemented:
plt.plot(forcing_vector_copy.get_local())
I think that the authors of [1] have created a very nice and usable prior using $\Theta$; we see that the covariance near the center of the domain looks like simple Gaussian blobs, while near the surface the response becomes very layered. Note, however, that the absolute value of the Green's function (i.e. the variance) increases towards the surface, according to the authors as a result of the boundary condition. What this means for the prior is that we can expect larger variations at the surface. We could correct for this using a spatially varying $\alpha$.
Now, to generate samples we need to 'randomly' force the PDE. We can do this with scale-invariant Gaussian noise, also called a Gaussian free field (GFF). A few other names for this field (if you want to Google) are massless free field or Euclidean bosonic massless free field. If the GFF is subject to the boundary condition $u = 0$, it is the $n$-dimensional extension of the Brownian bridge. The generated GFF fields are used as the source term in the PDE.
Now this may seem like an abstract thing, but what we do is use the FEM mass matrix $M$ as a covariance matrix. To generate samples of spatial white noise we then need its matrix square root $L$, s.t. $L^2 = M$. This is the main expensive operation of GRF sample generation!
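The claim that $w = L\xi$ has covariance $M$ can be checked with a toy Monte-Carlo experiment (a sketch with a small stand-in SPD matrix, not the actual FEM mass matrix):

```python
import numpy as np
from scipy.linalg import sqrtm

# Monte-Carlo check that w = M^{1/2} xi has covariance M (toy SPD stand-in).
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
M = B @ B.T + 4 * np.eye(4)          # stand-in "mass matrix"
L = np.real(sqrtm(M))                # its symmetric square root, L @ L == M

samples = L @ rng.standard_normal((4, 200000))
cov = samples @ samples.T / 200000   # empirical covariance
# cov approaches M as the number of samples grows
```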
I'm not sure how to avoid this operation. At least in the spectral element method, this is cheap, as $M$ would be diagonal. Generally, $L$ will not be cheap to compute!
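One possible workaround (not used in this notebook, and an approximation): mass lumping replaces $M$ by the diagonal matrix of its row sums, whose square root is just the elementwise square root and costs $O(n)$. A sketch with a hypothetical 1-D linear-element mass matrix:

```python
import numpy as np
from scipy.sparse import diags

# Hypothetical alternative (an approximation, not used in this notebook): "lump"
# the mass matrix to the diagonal of its row sums; the square root is then
# elementwise and O(n) instead of a dense sqrtm.
def lumped_sqrt(M):
    row_sums = np.asarray(M.sum(axis=1)).ravel()
    return diags(np.sqrt(row_sums))

# Toy 1-D linear-element mass matrix on a uniform grid with spacing h
n, h = 10, 0.1
M = diags(
    [np.full(n - 1, h / 6), np.full(n, 2 * h / 3), np.full(n - 1, h / 6)],
    offsets=[-1, 0, 1],
)
L = lumped_sqrt(M)
# For interior rows the lumped diagonal is exactly h, so L carries sqrt(h) there
```

This trades exactness for speed; as noted above, in the spectral element method $M$ is diagonal to begin with, so no approximation is needed there.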
%%time
massSqrt = linalg.sqrtm(mass_matrix_sparse.todense())
CPU times: user 1min 9s, sys: 29.4 s, total: 1min 39s Wall time: 8.94 s
After applying the spatial white noise as a RHS term we still need to solve the PDE. This can be done in a number of ways (as discussed in the next subsection, Generating a sample). One of these is to construct the combined operator that both generates spatial white noise and solves the PDE. We'll see that this method turns out to be the fastest way of sample generation; however, it does require us to generate one additional dense matrix on top of the square root mass matrix. Computing the combined operator can be done either with direct inversion or with a sparse solve, the latter being faster in most cases.
Again, in the spectral element method, the combined operator would be sparse, and therefore even faster to apply.
%%time
combined_operator_1 = linalg.inv(stiffness_matrix_sparse.todense()) @ massSqrt
CPU times: user 12.5 s, sys: 1.55 s, total: 14 s Wall time: 1.28 s
%%time
combined_operator = spsolve(stiffness_matrix_sparse, massSqrt)
CPU times: user 787 ms, sys: 538 ms, total: 1.32 s Wall time: 414 ms
Checking the sparsity and equivalence of the resulting operators:
print(
"Are the resulting operators numerically equal: %s."
% ("yes" if np.allclose(combined_operator, combined_operator_1) else "no")
)
print("Number of non-zero entries: %i" % np.nonzero(combined_operator)[0].size)
print("Total number of entries: %i" % np.prod(combined_operator.shape))
Are the resulting operators numerically equal: yes. Number of non-zero entries: 4145296 Total number of entries: 4145296
Now to actually generate a sample! To reiterate, we need to:
1. draw an $L^2$ white noise vector (i.i.d. standard normal entries);
2. turn it into spatial white noise by applying the square root mass matrix;
3. solve the PDE with this forcing.
We can do this in 3 ways:
1. solve the extracted sparse system with SciPy's `spsolve`;
2. apply the precomputed combined operator;
3. solve with Fenics' own machinery.
Let's try all these methods. First we'll use the same L2 random vector to see if we actually create the same samples using all three methods.
random_sample = np.random.randn(combined_operator.shape[0])
# Generate spatial white noise
forcing_vector.set_local(massSqrt @ random_sample)  # re-using random_sample so all methods can be compared
# Create sample function
sample_method1 = fenics.Function(V)
# Solve PDE with random forcing
%timeit sample_method1.vector().set_local(spsolve(stiffness_matrix_sparse, forcing_vector.get_local()));
5.17 ms ± 95.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Generate spatial white noise
forcing_vector.set_local(massSqrt @ random_sample)  # re-using random_sample so all methods can be compared
# Create sample function
sample_method3 = fenics.Function(V)
# Solve PDE with random forcing
%timeit fenics.solve(stiffness_matrix, sample_method3.vector(), forcing_vector)
5.38 ms ± 80.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# Generate L2 white noise
l2_white_noise = random_sample  # re-using random_sample so all methods can be compared
# Create sample function
sample_method2 = fenics.Function(V)
# Apply the combined spatial white noise and stiffness operator
%timeit sample_method2.vector().set_local(combined_operator @ l2_white_noise)
1.7 ms ± 135 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Whew, what a difference (on my machine)! It seems that the combined operator is the way to go, as all expensive operations are moved into precomputation. If we need to generate many samples, this can be a lifesaver. Let's see if the samples are equal.
fig = plt.figure(figsize=(10, 8), dpi=80, facecolor="w", edgecolor="k")
plt.subplot(231)
fenics.plot(sample_method1)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Sample method 1")
plt.subplot(232)
fenics.plot(sample_method2)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Sample method 2")
plt.subplot(233)
fenics.plot(sample_method3)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Sample method 3")
plt.subplot(212)
diff12 = np.abs(
sample_method1.vector().get_local() - sample_method2.vector().get_local()
)
diff13 = np.abs(
sample_method1.vector().get_local() - sample_method3.vector().get_local()
)
plt.semilogy(diff12, "-r", label="sample 1 vs 2")
plt.semilogy(diff13, "-k", label="sample 1 vs 3")
plt.xlabel("Parameter node")
plt.ylabel("Difference")
plt.legend()
plt.title("Sample difference per node")
plt.tight_layout()
print(
"Are all the samples numerically equal: %s."
% (
"yes"
if np.allclose(
sample_method1.vector().get_local(), sample_method2.vector().get_local()
)
and np.allclose(
sample_method1.vector().get_local(), sample_method3.vector().get_local()
)
and np.allclose(
sample_method2.vector().get_local(), sample_method3.vector().get_local()
)
else "no"
)
)
Are all the samples numerically equal: yes.
%%time
# Create the sample function
sample = fenics.Function(V)
fig, axes = plt.subplots(2, 2, figsize=(10, 8), dpi=80, facecolor="w", edgecolor="k")
axes = axes.flatten()
misfits1 = []
misfits2 = []
for i in range(10):
for j in range(4):
sample.vector().set_local(combined_operator @ np.random.randn(combined_operator.shape[0]))
C = sample.compute_vertex_values(mesh)
sample_vector = sample.vector().get_local()
operator_applied_vector = stiffness_matrix_sparse @ sample_vector
misfits1.append(np.linalg.norm(operator_applied_vector))
misfits2.append(np.linalg.norm(mass_matrix_sparse @ operator_applied_vector))
axes[j].tripcolor(
mesh2triang(mesh),
C,
shading="gouraud",
cmap=plt.get_cmap("RdBu"),
vmin=-2,
vmax=2,
)
axes[j].set_aspect("equal")
axes[j].axis("off")
plt.tight_layout()
display.clear_output(wait=True)
display.display(plt.gcf())
display.clear_output(wait=True)
CPU times: user 21.9 s, sys: 13.5 s, total: 35.4 s Wall time: 11.4 s
plt.subplot(1,2,1)
plt.hist(misfits1, 100);
plt.subplot(1,2,2)
plt.hist(misfits2, 100);
Now we have said that GRFs make the inverse problem mesh independent. However, the mesh does need to fulfil some requirements. At the least, it should resolve the eigenfunctions corresponding to the $n$ largest eigenvalues, otherwise we won't see these variations in generated samples. This becomes evident when we use a very coarse grid. Let's go through the same process at light speed.
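A 1-D sketch of this resolution argument, using a simple finite-difference Dirichlet Laplacian as a stand-in for our FEM operator (the boundary condition is a simplifying assumption):

```python
import numpy as np

# Spectral view of mesh resolution (1-D sketch, Dirichlet for simplicity): a grid
# with n interior points carries only n eigenmodes of -d^2/dx^2 + 1, and only the
# low modes are accurate, so prior variation in higher modes cannot appear in samples.
def fd_eigenvalues(n):
    h = 1.0 / (n + 1)
    A = (np.diag(np.full(n, 2.0))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2 + np.eye(n)
    return np.sort(np.linalg.eigvalsh(A))

coarse, fine = fd_eigenvalues(8), fd_eigenvalues(256)
exact = 1.0 + (np.arange(1, 9) * np.pi) ** 2   # continuum eigenvalues 1 + (k*pi)^2
# The fine grid reproduces the first continuum eigenvalues far more accurately,
# and the coarse grid has no modes at all beyond its 8th eigenpair.
```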
A rather 'shitty' mesh:
mesh_lowres = mshr.generate_mesh(domain, 5)
fenics.plot(mesh_lowres)
Setting up the PDE:
V_lowres = fenics.FunctionSpace(mesh_lowres, "CG", 1)
u_lowres = fenics.TrialFunction(V_lowres)
v_lowres = fenics.TestFunction(V_lowres)
# Theta and alpha remain unchanged, as they do not incorporate the function space or the mesh
# fenics.dx creates a domain integral, fenics.ds creates a surface integral
bilinear_component_lowres = (
alpha
* (
fenics.dot(theta * fenics.grad(u_lowres), fenics.grad(v_lowres))
+ u_lowres * v_lowres
)
* fenics.dx
)
linear_component_lowres = fenics.Constant(0) * v_lowres * fenics.dx
Assembling the matrices for the low resolution mesh:
stiffness_matrix_lowres, forcing_vector_lowres = fenics.assemble_system(
bilinear_component_lowres, linear_component_lowres
)
mass_matrix_lowres = fenics.assemble(u_lowres * v_lowres * fenics.dx)
K_lowres = fenics.as_backend_type(stiffness_matrix_lowres).mat()
M_lowres = fenics.as_backend_type(mass_matrix_lowres).mat()
Ka, Kb, Kc = K_lowres.getValuesCSR()
stiffness_matrix_sparse_lowres = csr_matrix((Kc, Kb, Ka), K_lowres.size)
Ka, Kb, Kc, K_lowres = (None, None, None, None)  # Freeing up space
Ma, Mb, Mc = M_lowres.getValuesCSR()
mass_matrix_sparse_lowres = csr_matrix((Mc, Mb, Ma), M_lowres.size)
Ma, Mb, Mc, M_lowres = (None, None, None, None)  # Freeing up space
Precompute the combined operator:
operator_lowres = linalg.inv(stiffness_matrix_sparse_lowres.todense()) @ linalg.sqrtm(mass_matrix_sparse_lowres.todense())
And sampling:
# Create the sample function
sample_lowres = fenics.Function(V_lowres)
fig, axes = plt.subplots(2, 2, figsize=(10, 8), dpi=80, facecolor="w", edgecolor="k")
axes = axes.flatten()
for i in range(5):
for j in range(4):
plt.subplot(2, 2, j + 1)
sample_lowres.vector().set_local(operator_lowres @ np.random.randn(operator_lowres.shape[0]))
C_lowres = sample_lowres.compute_vertex_values(mesh_lowres)
plt.tripcolor(
mesh2triang(mesh_lowres),
C_lowres,
shading="gouraud",
cmap=plt.get_cmap("RdBu"),
vmin=-2,
vmax=2,
)
plt.gca().set_aspect("equal")
plt.axis("off")
plt.tight_layout()
display.clear_output(wait=True)
display.display(plt.gcf());
display.clear_output(wait=True)
As you can see, the small-scale variations that were allowed to exist on the finer mesh are not visible on these coarse meshes; only long-wavelength characteristics are preserved.